This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code. Try executing this chunk by clicking the Run button within the chunk or by placing your cursor inside it and pressing Cmd+Shift+Enter. Add a new chunk by clicking the Insert Chunk button on the toolbar or by pressing Cmd+Option+I. When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Cmd+Shift+K to preview the HTML file). The preview shows you a rendered HTML copy of the contents of the editor. Consequently, unlike Knit, Preview does not run any R code chunks. Instead, the output of the chunk when it was last run in the editor is displayed.
The number of registered private cars in Sweden for the years 1977
until February 2025, monthly data, is given in the second column of the
file carsmon.dat at Studium.
Find a suitable ARIMA (or SARIMA) model for these data, or a transformation thereof. Analyze the model residuals carefully, in order to make sure that the model provides a good description of the data.
It might be a good idea to try transformations, like the logarithm.
#########################
data = read.table("~/uni/analysis-time-series/carsmon.dat", header=TRUE)
x = log(data$count) # logarithm operation
automodel = auto.arima(x, d = 1, max.p = 4, max.q = 2, seasonal = T)
summary(automodel)
Series: x
ARIMA(4,1,1) with drift
Coefficients:
ar1 ar2 ar3 ar4 ma1 drift
1.1062 -0.3209 0.0409 -0.3194 -0.6301 1e-03
s.e. 0.0458 0.0630 0.0611 0.0422 0.0293 1e-04
sigma^2 = 9.633e-06: log likelihood = 2527.02
AIC=-5040.03 AICc=-5039.83 BIC=-5009.53
Training set error measures:
ME RMSE MAE MPE MAPE MASE ACF1
Training set 2.400795e-05 0.003084822 0.002192997 0.0001721209 0.01443169 0.4994104 -0.06969448
plot(data$count, type='l')
title(main = "Number of registered private cars in Sweden")
plot(x, type='l')
title(main = "Number of registered private cars in Sweden (log)")
par(mfrow=c(1,2)) # Making the display of 2 plots
acf(x, lag.max = 50) # ACF
pacf(x, lag.max = 50) # PACF cuts off after lag 1
## DIFFERENCING
lag = 12
dx = diff(x, lag=lag)
plot(dx, type='l')
title(main = paste("Number of registered private cars in Sweden diff =", lag, sep=" "))
par(mfrow=c(1,2)) # Making the display of 2 plots
acf(dx, lag.max = 50) # ACF
pacf(dx, lag.max = 50) # PACF cuts off after lag 1
## SEASONAL DIFFERENCING
# slag = 1
# sdx = diff(dx, lag=slag)
# plot(sdx, type='l')
# title(main = paste("Number of registered private cars in Sweden diff-1, and sdiff =", slag, sep=" "))
# par(mfrow=c(1,2)) # Making the display of 2 plots
# acf(sdx, lag.max = 50) # ACF
# pacf(sdx, lag.max = 50) # PACF cuts off after lag 1
#########################
# Oscillations in ACF, with peak-to-peak difference of around 11-12 timesteps (months)
# So ARIMA(11,12,0) model
smodel=arima(dx,
order=c(10,0,0),
seasonal=list(order=c(0,0,11),period=slag)
)
#10 - -5401 aic
#11 -
acf(smodel$residuals, lag.max = 50) # ACF
pacf(smodel$residuals, lag.max = 50) # PACF cuts off after lag 1
# Histogram
# par(mfrow=c(1,3))
hist(smodel$residuals)
qqnorm(smodel$residuals)
tsdiag(smodel, gof.lag = 50)
summary(smodel)
Call:
arima(x = dx, order = c(10, 0, 0), seasonal = list(order = c(0, 0, 11), period = slag))
Coefficients:
ar1 ar2 ar3 ar4 ar5 ar6 ar7 ar8 ar9 ar10 sma1 sma2 sma3 sma4 sma5 sma6
-0.1967 0.0891 0.2460 0.1350 0.1886 0.1832 -0.0018 0.0971 0.1433 -0.0126 1.0524 0.9987 0.8726 0.7520 0.6063 0.4779
s.e. 0.0726 0.0778 0.0756 0.0694 0.0663 0.0697 0.0803 0.0705 0.0696 0.0586 0.0602 0.1082 0.1470 0.1658 0.1662 0.1487
sma7 sma8 sma9 sma10 sma11 intercept
0.5000 0.4675 0.4238 0.4879 0.7036 0.0095
s.e. 0.1254 0.0996 0.0748 0.0562 0.0399 0.0043
sigma^2 estimated as 2.722e-06: log likelihood = 2813.58, aic = -5581.17
Training set error measures:
ME RMSE MAE MPE MAPE MASE ACF1
Training set 4.913448e-05 0.001649704 0.001095964 24.96375 47.29133 0.8496724 -0.002326466